---
title: Query Plan Caching
subtitle: Configure in-memory and distributed caching for query plans
description: Configure query plan caching to improve router performance by storing generated query plans in memory or Redis.
---

Whenever your router receives an incoming GraphQL operation, it generates a [query plan](/graphos/resources/federation/query-plans) to determine which subgraphs it needs to query to resolve that operation.

By caching previously generated query plans, your router can skip generating them again if a client later sends the exact same operation—improving your router's responsiveness.

## Performance improvements vs. stability

The router is a highly scalable and low-latency runtime. Even with all caching **disabled**, the time to process operations and query plans is minimal (nanoseconds to milliseconds) when compared to the overall supergraph request, except in edge cases of extremely large operations and supergraphs.

For those running a large graph, caching offers stability: the overhead for a given operation stays consistent rather than dramatically improving. To validate the performance wins of operation caching, check out the [traces and metrics in the router](/graphos/routing/observability/telemetry/instrumentation/standard-instruments#performance) and take measurements before and after.

In extreme edge cases, however, the cache can reduce query plan generation time by 2-10x, though planning remains a small part of the overall request.

## In-memory caching

GraphOS Router enables query plan caching by default using an in-memory LRU cache. In your router's [YAML config file](/graphos/routing/configuration/overview#yaml-config-file), you can configure the maximum number of query plan entries in the cache:

```yaml title="router.yaml"
supergraph:
  query_planning:
    cache:
      in_memory:
        limit: 512 # This is the default value.
```
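The in-memory cache evicts the least recently used plan once the entry limit is reached. A minimal Python sketch of that eviction policy (the `LruCache` class and its names are illustrative, not router internals):

```python
from collections import OrderedDict

class LruCache:
    """Illustrative LRU cache with a fixed entry limit."""

    def __init__(self, limit: int = 512):
        self.limit = limit
        self.entries: OrderedDict[str, object] = OrderedDict()

    def get(self, key: str):
        if key not in self.entries:
            return None
        # Accessing an entry marks it as most recently used.
        self.entries.move_to_end(key)
        return self.entries[key]

    def put(self, key: str, plan: object) -> None:
        self.entries[key] = plan
        self.entries.move_to_end(key)
        if len(self.entries) > self.limit:
            # Evict the least recently used entry.
            self.entries.popitem(last=False)

cache = LruCache(limit=2)
cache.put("query A", "plan A")
cache.put("query B", "plan B")
cache.get("query A")            # "query A" is now most recently used
cache.put("query C", "plan C")  # evicts "query B"
```

Raising `limit` trades memory for a higher hit rate on graphs with many distinct operations.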

### Cache warm-up

When the router loads a new schema, the query plan for a given query might change, so previously cached query plans cannot be reused.

To prevent increased latency upon query plan cache invalidation, the router precomputes query plans for the most used queries from the cache when a new schema is loaded.

Precomputed plans are cached before the router switches traffic over to the new schema.

<Tip>

You can also send the header `Apollo-Expose-Query-Plan: dry-run` for [generating query plans at runtime](https://www.apollographql.com/docs/graphos/resources/federation/query-plans#outputting-query-plans-with-headers), which you can use to warm up your cache instances with a custom-defined operation list.

</Tip>

By default, the router warms up the cache with 30% of the queries already in cache, but you can configure it as follows:

```yaml title="router.yaml"
supergraph:
  query_planning:
    # Pre-plan the 100 most used operations when the supergraph changes
    warmed_up_queries: 100
```

In addition, the router can use the contents of the [persisted query list](/graphos/routing/security/persisted-queries) to prewarm the cache. By default, it does this when loading a new schema but not on startup; you can [configure](/graphos/routing/security/persisted-queries#experimental_prewarm_query_plan_cache) it to change either of these defaults.

#### Cache warm-up with headers

<MinVersionBadge version="Router v1.61.0" />

With router v1.61.0+ and v2.x+, if you have enabled exposing query plans via `--dev` mode or `plugins.experimental.expose_query_plan: true`, you can pass the `Apollo-Expose-Query-Plan` header to return query plans in the GraphQL response extensions. You must set the header to one of the following values:

- `true`: Returns a human-readable string and JSON blob of the query plan while still executing the query to fetch data.
- `dry-run`: Generates the query plan and aborts without executing the query.

After using `dry-run`, query plans are saved to your configured cache locations. Using real, mirrored, or similar-to-production operations is a great way to warm up the caches before transitioning traffic to new router instances.
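For example, a warm-up script can replay a list of known operations with the `dry-run` header set. The sketch below only builds the HTTP requests; the router URL and operation list are placeholders, and actually sending the requests requires a running router with query plan exposure enabled:

```python
import json
import urllib.request

def build_dry_run_request(router_url: str, query: str) -> urllib.request.Request:
    """Build a POST request that plans `query` without executing it."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        router_url,
        data=body,
        headers={
            "Content-Type": "application/json",
            # Tells the router to generate (and cache) the plan, then abort.
            "Apollo-Expose-Query-Plan": "dry-run",
        },
        method="POST",
    )

# Hypothetical router endpoint and operation list:
operations = ["query { me { id } }", "query { products { name } }"]
requests = [build_dry_run_request("http://localhost:4000/", q) for q in operations]
```

Replaying real, mirrored, or similar-to-production operations this way populates the caches without executing any subgraph fetches.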

### Monitoring cache performance

To get more information on the planning and warm-up process, use the following metrics (where `<storage>` can be `redis` for distributed cache or `memory`):

#### Counters

- `apollo.router.cache.hit.time.count{kind="query planner", storage="<storage>"}`
- `apollo.router.cache.miss.time.count{kind="query planner", storage="<storage>"}`

#### Histograms

- `apollo.router.query_planning.plan.duration`: time spent planning queries
  - `planner`: The query planner implementation used (`rust` or `js`)
  - `outcome`: The outcome of the query planning process (`success`, `timeout`, `cancelled`, `error`)
- `apollo.router.schema.load.duration`: time spent loading a schema
- `apollo.router.cache.hit.time{kind="query planner", storage="<storage>"}`: time to get a value from the cache
- `apollo.router.cache.miss.time{kind="query planner", storage="<storage>"}`

#### Gauges

- `apollo.router.cache.size{kind="query planner", storage="memory"}`: current size of the cache (only for in-memory cache)
- `apollo.router.cache.storage.estimated_size{kind="query planner", storage="memory"}`: estimated storage size of the cache (only for in-memory query planner cache)

To define the right size of the in-memory cache, monitor `apollo.router.cache.size` and the cache hit rate. Then examine `apollo.router.schema.load.duration` and `apollo.router.query_planning.plan.duration` to decide how much time to spend warming up queries.
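The hit rate itself can be derived from the two counters above; a minimal sketch (the metric values here are made up):

```python
def cache_hit_rate(hits: int, misses: int) -> float:
    """Fraction of cache lookups served from the cache."""
    total = hits + misses
    return hits / total if total else 0.0

# Hypothetical values read from apollo.router.cache.hit.time.count
# and apollo.router.cache.miss.time.count:
rate = cache_hit_rate(hits=950, misses=50)  # 0.95
```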

## Distributed caching with Redis

<PlanRequired plans={["Free", "Developer", "Standard", "Enterprise"]}>

Rate limits apply on the Free plan.
Performance pricing applies on Developer and Standard plans.
Developer and Standard plans require Router v2.6.0 or later.

</PlanRequired>

If you have multiple GraphOS Router instances, those instances can share a Redis-backed cache for their query plans. This means that if _any_ of your router instances caches a particular value, _all_ of your instances can look up that value to significantly improve responsiveness.

### Prerequisites

To use distributed caching:

- You must have a Redis cluster (or single instance) that your router instances can communicate with.
- You must have a [GraphOS Enterprise plan](https://www.apollographql.com/pricing/) and [connect your router to GraphOS](/graphos/routing/configuration/overview#environment-variables).

### How it works

Whenever a router instance requires a query plan to resolve a client operation:

1. The router instance checks its own [in-memory cache](#in-memory-caching) for the required value and uses it if found.
2. If _not_ found, the router instance then checks the distributed Redis cache and uses the value if found, also storing it in its own in-memory cache.
3. If _not_ found, the router instance _generates_ the required query plan.
4. The router instance stores the obtained value in both the distributed cache _and_ its in-memory cache.
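The lookup order above can be sketched as follows (the in-memory and Redis tiers are modeled as plain dicts, and `generate_plan` stands in for the actual planner):

```python
def get_query_plan(operation: str, memory: dict, redis: dict, generate_plan) -> str:
    # 1. Check the local in-memory cache first.
    if operation in memory:
        return memory[operation]
    # 2. Fall back to the shared Redis cache; replicate locally on a hit.
    if operation in redis:
        memory[operation] = redis[operation]
        return memory[operation]
    # 3. Generate the plan, then 4. store it in both tiers.
    plan = generate_plan(operation)
    redis[operation] = plan
    memory[operation] = plan
    return plan

memory, redis = {}, {}
plan = get_query_plan("query { me { id } }", memory, redis, lambda op: f"plan for {op}")
```

A second instance (an empty `memory` dict sharing the same `redis` dict) would find the plan in Redis and skip planning entirely.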

### Redis URL configuration

The distributed caching configuration must contain one or more URLs using different schemes depending on the expected deployment:

- `redis` — TCP connected to a centralized server.
- `rediss` — TLS connected to a centralized server.
- `redis-cluster` — TCP connected to a cluster.
- `rediss-cluster` — TLS connected to a cluster.
- `redis-sentinel` — TCP connected to a centralized server behind a sentinel layer.
- `rediss-sentinel` — TLS connected to a centralized server behind a sentinel layer.

The URLs must have the following format:

#### One node

```
redis|rediss :// [[username:]password@] host [:port][/database]
```

Example: `redis://localhost:6379`

#### Clustered

```
redis|rediss[-cluster] :// [[username:]password@] host [:port][?[node=host1:port1][&node=host2:port2][&node=hostN:portN]]
```

or, if configured with multiple URLs:

```
[
  "redis|rediss[-cluster] :// [[username:]password@] host [:port]",
  "redis|rediss[-cluster] :// [[username:]password@] host1 [:port1]",
  "redis|rediss[-cluster] :// [[username:]password@] host2 [:port2]"
]
```

#### Sentinel

```
redis|rediss[-sentinel] :// [[username1:]password1@] host [:port][/database][?[node=host1:port1][&node=host2:port2][&node=hostN:portN]
                            [&sentinelServiceName=myservice][&sentinelUsername=username2][&sentinelPassword=password2]]
```

or, if configured with multiple URLs:

```
[
  "redis|rediss[-sentinel] :// [[username:]password@] host [:port][/database][?[&sentinelServiceName=myservice][&sentinelUsername=username2][&sentinelPassword=password2]]",
  "redis|rediss[-sentinel] :// [[username1:]password1@] host [:port][/database][?[&sentinelServiceName=myservice][&sentinelUsername=username2][&sentinelPassword=password2]]"
]
```

### Router configuration

<Tip>

In your router's YAML config file, **you should specify your Redis URLs via environment variables and [variable expansion](/graphos/routing/configuration/overview#variable-expansion)**. This prevents your Redis URLs from being committed to version control, which is especially dangerous if they include authentication information like a username and/or password.

</Tip>

<Caution>

Cached query plans are not evicted on schema refresh, which can quickly lead to distributed cache overflow when combined with [cache warm-up](#cache-warm-up) and frequent schema publishes.

Test your cache configuration with expected queries and consider decreasing the [TTL](#ttl) to prevent cache overflow.

</Caution>

To enable distributed caching of query plans, add the following to your router's [YAML config file](/graphos/routing/configuration/overview#yaml-config-file):

```yaml title="router.yaml"
supergraph:
  query_planning:
    cache:
      redis:
        urls: ["redis://..."]
```

The value of `urls` is a list of URLs for all Redis instances in your cluster.

All query plan cache entries will be prefixed with `plan.` within the distributed cache.

### Redis configuration options

```yaml title="router.yaml"
supergraph:
  query_planning:
    cache:
      redis:
        urls: ["redis://..."]
        username: admin/123 # Optional. Can also be part of the URLs directly; mainly useful if your username contains a special character like '/' that doesn't work in a URL. This field takes precedence over the username in the URL.
        password: admin # Optional. Can also be part of the URLs directly; mainly useful if your password contains a special character like '/' that doesn't work in a URL. This field takes precedence over the password in the URL.
        timeout: 2s # Optional, by default: 500ms
        ttl: 24h # Optional
        namespace: "prefix" # Optional
        #tls:
        required_to_start: false # Optional, defaults to false
        reset_ttl: true # Optional, defaults to true
        pool_size: 4 # Optional, defaults to 1
```

#### Timeout

Connecting and sending commands to Redis have a timeout of 500ms by default, which you can override.

#### TTL

The `ttl` option defines the default global expiration for Redis entries. For query plan caching, the default expiration is set to 30 days.

When enabling distributed caching, consider how frequently you publish new schemas and configure the TTL accordingly. When new schemas are published, the router [pre-warms](#cache-warm-up) the in-memory and distributed caches but doesn't invalidate existing cached query plans in the distributed cache, creating an additive effect on cache utilization.

To prevent cache overflow, consider decreasing the TTL to 24 hours or twice the median publish interval (whichever is less), and monitor cache utilization in your environment, especially during schema publish events.

Also note that when [cache warm-up](#cache-warm-up) is enabled, each router instance will warm the distributed cache with query plans from _its own in-memory cache_. In the worst case, a schema publish will increase the number of query plans in the distributed cache by the number of router instances multiplied by the number of warmed-up queries per instance, which may noticeably increase the total cache utilization.
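As a rough illustration of that worst case (the instance and query counts below are made up):

```python
def worst_case_new_entries(router_instances: int, warmed_up_queries: int) -> int:
    """Upper bound on distributed-cache entries written during warm-up
    after a single schema publish."""
    return router_instances * warmed_up_queries

# e.g. 10 router instances, each warming up 100 queries:
new_entries = worst_case_new_entries(10, 100)  # up to 1,000 new cache entries
```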

<Tip>
Be sure to test your configuration with expected queries and during schema publish events to understand the impact of distributed caching on cache utilization.
</Tip>

#### Namespace

When using the same Redis instance for multiple purposes, the `namespace` option defines a prefix for all the keys defined by the router.

#### TLS

For Redis TLS connections, you can set up a client certificate or override the root certificate authority by configuring `tls` in your router's [YAML config file](/graphos/routing/configuration/overview#yaml-config-file). For example:

```yaml
supergraph:
  query_planning:
    cache:
      redis:
        urls: ["rediss://redis.example.com:6379"]
        tls:
          certificate_authorities: ${file./path/to/ca.crt}
          client_authentication:
            certificate_chain: ${file./path/to/certificate_chain.pem}
            key: ${file./path/to/key.pem}
```

#### Required to start

When active, the `required_to_start` option prevents the router from starting if it cannot connect to Redis. By default, the router still starts without a Redis connection, in which case it uses only its in-memory cache for query planning.

#### Reset TTL

When this option is active, accessing a cache entry in Redis will reset its expiration.

#### Pool size

The `pool_size` option defines the number of connections the router opens to Redis. By default, the router opens a single connection. If there is a lot of traffic between the router and Redis, or if those requests have noticeable latency, increase the pool size to reduce that latency.

### Cache warm-up with distributed caching

If the router uses distributed caching for query plans, the warm-up phase also stores the new query plans in Redis. Since all router instances might have the same distributions of queries in their in-memory cache, the list of queries is shuffled before warm-up, so each router instance can plan queries in a different order and share their results through the cache.
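A sketch of that shuffling step (each instance plans the same set of queries, just in a different per-instance order, so racing instances share work through Redis rather than duplicating it):

```python
import random

def warm_up_order(cached_queries: list[str]) -> list[str]:
    """Return this instance's warm-up order: the same queries, shuffled."""
    order = list(cached_queries)
    random.shuffle(order)
    return order

queries = ["query A", "query B", "query C", "query D"]
order = warm_up_order(queries)  # e.g. ["query C", "query A", "query D", "query B"]
```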
